Ideological divisions within the United States have become increasingly prominent in daily communication. Accordingly, there has been much research on political polarization, including many recent efforts that take a computational perspective. By detecting political biases in a text corpus, one can attempt to describe and discern the polarity of that text. Intuitively, named entities (i.e., nouns and phrases that function as nouns) and hashtags in text often carry information about political views. For example, people who use the term "pro-choice" are likely to be liberal, whereas people who use the term "pro-life" are likely to be conservative. In this paper, we seek to reveal political polarities in social-media text data and to quantify these polarities by assigning polarity scores to entities and hashtags. Although this idea is straightforward, it is difficult to perform such inference in a trustworthy quantitative way. Key challenges include the small number of known labels, the continuous spectrum of political views, and the need for a word embedding to preserve both a polarity score and polarity-neutral semantic meaning. To overcome these challenges, we propose the Polarity-aware Embedding Multi-task learning (PEM) model. This model consists of (1) a self-supervised context-preservation task, (2) an attention-based tweet-level polarity-inference task, and (3) an adversarial-learning task that promotes independence between an embedding's polarity dimension and its semantic dimensions. Our experimental results demonstrate that our PEM model can successfully learn polarity-aware embeddings. We examine a variety of applications to demonstrate the effectiveness of the PEM model. We also discuss important limitations of our work and stress caution when applying the PEM model to real-world scenarios.
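To make the three-task structure concrete, here is a minimal PyTorch sketch, assuming a token embedding split into semantic dimensions plus one polarity dimension. All names, dimensions, and the gradient-reversal adversary are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Reverses gradients so semantic dimensions become uninformative about polarity."""
    @staticmethod
    def forward(ctx, x):
        return x.clone()
    @staticmethod
    def backward(ctx, grad):
        return -grad

class PEMSketch(nn.Module):
    def __init__(self, vocab_size, sem_dim=100):
        super().__init__()
        self.sem = nn.Embedding(vocab_size, sem_dim)        # polarity-neutral semantics
        self.pol = nn.Embedding(vocab_size, 1)              # per-token polarity score
        self.context_head = nn.Linear(sem_dim, vocab_size)  # (1) context preservation
        self.attn = nn.Linear(sem_dim, 1)                   # (2) tweet-level attention
        self.adversary = nn.Linear(sem_dim, 1)              # (3) polarity adversary

    def forward(self, tokens):                              # tokens: (batch, seq_len)
        s, p = self.sem(tokens), self.pol(tokens)
        ctx_logits = self.context_head(s)                   # predict surrounding words
        w = torch.softmax(self.attn(s), dim=1)              # attention over tokens
        tweet_polarity = (w * p).sum(dim=1)                 # attention-weighted tweet polarity
        adv_pred = self.adversary(GradReverse.apply(s))     # adversary sees reversed gradients
        return ctx_logits, tweet_polarity, adv_pred
```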
Improvements in technology track the problems of their time. It has been observed that as time passes, the number of problems that humans face increases, but the technologies for solving them also tend to improve. One of the oldest such problems, originating with the invention of the vehicle, is parking. Over the years, technology has made this problem easier to tackle, yet parking remains largely unsolved. The main reason is that parking is not a single problem; rather, it comprises a set of problems. One of these is occupancy detection for parking slots in a distributed parking ecosystem, in which users find a preferred parking spot rather than a random one. In this paper, we propose a web-based application as a solution for parking-space occupancy detection across different parking lots. The solution is based on computer vision (CV) and is built with the Django framework, written in Python 3.0. It addresses the occupancy-detection problem and also gives users the option to choose a block based on availability and preference. The evaluation results of our proposed system are promising and effective. The proposed system can also be integrated with different systems and used to solve other related parking problems.
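The abstract does not specify which CV technique is used; the following is a hedged sketch of the kind of classical occupancy check such a system might perform, where the slot coordinates and edge-density threshold are hypothetical:

```python
import cv2

SLOTS = {"A1": (50, 80, 120, 60), "A2": (180, 80, 120, 60)}  # x, y, w, h (hypothetical)
EDGE_THRESHOLD = 0.08  # fraction of edge pixels above which a slot is "occupied"

def detect_occupancy(frame_path):
    frame = cv2.imread(frame_path)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)
    status = {}
    for slot_id, (x, y, w, h) in SLOTS.items():
        patch = edges[y:y + h, x:x + w]
        edge_density = (patch > 0).mean()  # cars produce dense edges inside the slot
        status[slot_id] = "occupied" if edge_density > EDGE_THRESHOLD else "free"
    return status
```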
Machine learning practitioners often have access to a spectrum of data: labeled data for the target task (which is often limited), unlabeled data, and auxiliary data, the many available labeled datasets for other tasks. We describe TAGLETS, a system built to study techniques for automatically exploiting all three types of data and creating high-quality, servable classifiers. The key components of TAGLETS are: (1) auxiliary data organized according to a knowledge graph, (2) modules encapsulating different methods for exploiting auxiliary and unlabeled data, and (3) a distillation stage in which the ensembled modules are combined into a servable model. We compare TAGLETS with state-of-the-art transfer learning and semi-supervised learning methods on four image classification tasks. Our study covers a range of settings, varying the amount of labeled data and the semantic relatedness of the auxiliary data to the target task. We find that the intelligent incorporation of auxiliary and unlabeled data into multiple learning techniques enables TAGLETS to match, and most often surpass, these alternatives. TAGLETS is available as an open-source system at Github.com/batsresearch/taglet.
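A minimal sketch of the distillation stage described in component (3): soft labels from an ensemble of modules are averaged and a single servable student is trained on them. The module list, temperature, and training loop are simplified assumptions:

```python
import torch
import torch.nn.functional as F

def distill(student, modules, unlabeled_loader, optimizer, temperature=2.0):
    for x in unlabeled_loader:
        with torch.no_grad():
            # Ensemble the modules' predictions into one soft target per example.
            soft = torch.stack([F.softmax(m(x) / temperature, dim=-1)
                                for m in modules]).mean(0)
        # Train the student to match the ensembled soft labels.
        loss = F.kl_div(F.log_softmax(student(x) / temperature, dim=-1),
                        soft, reduction="batchmean")
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```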
The task of reconstructing 3D human motion has wide-ranging applications. Gold-standard motion capture (MoCap) systems are accurate but inaccessible to the general public due to their cost, hardware, and space constraints. In contrast, monocular human mesh recovery (HMR) methods are much more accessible than MoCap as they take single-view videos as inputs. Replacing multi-view MoCap systems with a monocular HMR method would break the current barriers to collecting accurate 3D motion, making exciting applications like motion analysis and motion-driven animation accessible to the general public. However, the performance of existing HMR methods degrades when the video contains challenging and dynamic motion that is not in the existing MoCap datasets used for training. This reduces their appeal, as dynamic motion is frequently the target of 3D motion recovery in the aforementioned applications. Our study aims to bridge the gap between monocular HMR and multi-view MoCap systems by leveraging information shared across multiple video instances of the same action. We introduce the Neural Motion (NeMo) field, which is optimized to represent the underlying 3D motions across a set of videos of the same action. Empirically, we show that NeMo can recover 3D motion in sports using videos from the Penn Action dataset, where NeMo outperforms existing HMR methods in terms of 2D keypoint detection. To further validate NeMo using 3D metrics, we collected a small MoCap dataset mimicking actions in Penn Action, and show that NeMo achieves better 3D reconstruction compared to various baselines.
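Conceptually, fitting a shared motion field to several videos of the same action might look like the sketch below, which minimizes 2D reprojection error; the field parameterization, joint count, and camera model are simplified assumptions rather than NeMo's actual design:

```python
import torch
import torch.nn as nn

class MotionField(nn.Module):
    """Maps a normalized action phase t in [0, 1] to 3D joint positions."""
    def __init__(self, n_joints=17, hidden=256):
        super().__init__()
        self.n_joints = n_joints
        self.mlp = nn.Sequential(nn.Linear(1, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_joints * 3))

    def forward(self, t):                                # t: (batch, 1)
        return self.mlp(t).view(-1, self.n_joints, 3)    # (batch, n_joints, 3)

def reprojection_loss(field, t, keypoints_2d, project):
    """project: a per-video camera that maps 3D joints to image coordinates."""
    joints_3d = field(t)
    return ((project(joints_3d) - keypoints_2d) ** 2).mean()
```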
We propose AnyTOD, an end-to-end task-oriented dialog (TOD) system with zero-shot capability for unseen tasks. We view TOD as a program executed by a language model (LM), where the program logic and ontology are provided by a designer in the form of a schema. To enable generalization to unseen schemas and programs without prior training, AnyTOD adopts a neuro-symbolic approach. A neural LM keeps track of events that occur during a conversation, and a symbolic program implementing the dialog policy is executed to recommend the next actions AnyTOD should take. This approach drastically reduces data annotation and model training requirements, addressing a long-standing challenge in TOD research: rapidly adapting a TOD system to unseen tasks and domains. We demonstrate state-of-the-art results on the STAR and ABCD benchmarks, as well as AnyTOD's strong zero-shot transfer capability in low-resource settings. In addition, we release STARv2, an updated version of the STAR dataset with richer data annotations, for benchmarking zero-shot end-to-end TOD models.
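A toy illustration of the neuro-symbolic split: a neural component tracks which schema events have occurred, and a symbolic policy over the schema recommends the next action. The schema, events, and rules here are hypothetical, not the STAR/STARv2 ontology:

```python
from typing import Dict, List

SCHEMA = {
    "events": ["user_asked_price", "user_gave_city"],
    "actions": ["inform_price", "ask_city"],
}

def track_events(dialog_history: List[str]) -> Dict[str, bool]:
    """Stand-in for the neural LM: flags which schema events have occurred."""
    text = " ".join(dialog_history).lower()
    return {
        "user_asked_price": "price" in text or "how much" in text,
        "user_gave_city": any(c in text for c in ("boston", "tokyo")),
    }

def recommend_action(state: Dict[str, bool]) -> str:
    """Symbolic dialog policy executed over the tracked state."""
    if state["user_asked_price"] and not state["user_gave_city"]:
        return "ask_city"
    if state["user_asked_price"]:
        return "inform_price"
    return "ask_city"

print(recommend_action(track_events(["How much is a ticket?", "I'm in Boston."])))
```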
A long-standing goal of machine-learning-based protein engineering is to accelerate the discovery of novel mutations that improve the function of a known protein. We introduce a sampling framework for evolving proteins in silico that supports mixing and matching a variety of unsupervised models, such as protein language models, and supervised models that predict protein function from sequence. By composing these models, we aim to improve our ability to evaluate unseen mutations and constrain search to regions of sequence space likely to contain functional proteins. Our framework achieves this without any model fine-tuning or re-training by constructing a product of experts distribution directly in discrete protein space. Instead of resorting to brute force search or random sampling, which is typical of classic directed evolution, we introduce a fast MCMC sampler that uses gradients to propose promising mutations. We conduct in silico directed evolution experiments on wide fitness landscapes and across a range of different pre-trained unsupervised models, including a 650M parameter protein language model. Our results demonstrate an ability to efficiently discover variants with high evolutionary likelihood as well as estimated activity multiple mutations away from a wild-type protein, suggesting our sampler provides a practical and effective new paradigm for machine-learning-based protein engineering.
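A schematic of the product-of-experts score and a Metropolis accept step over discrete sequences is sketched below. The experts, weights, and `propose` function are simplified stand-ins; a symmetric proposal is assumed here, whereas the paper's sampler uses gradients to bias proposals toward promising mutations:

```python
import math
import random

def product_of_experts_logp(seq, experts, weights):
    """Unnormalized log-probability: weighted sum of expert log-scores."""
    return sum(w * e(seq) for e, w in zip(experts, weights))

def mcmc_step(seq, experts, weights, propose):
    cand = propose(seq)  # e.g., a (gradient-guided) point mutation
    log_ratio = (product_of_experts_logp(cand, experts, weights)
                 - product_of_experts_logp(seq, experts, weights))
    # Metropolis rule: always accept improvements, sometimes accept downhill moves.
    return cand if math.log(random.random()) < log_ratio else seq
```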
Most research on task oriented dialog modeling is based on written text input. However, users often interact with practical dialog systems using speech as input. Typically, systems convert speech into text using an Automatic Speech Recognition (ASR) system, introducing errors. Furthermore, these systems do not address the differences between written and spoken language. Research on this topic has been stymied by the lack of a public corpus. Motivated by these considerations, our goal in hosting the speech-aware dialog state tracking challenge was to create a public corpus and task that can be used to investigate the performance gap between the written and spoken forms of input, develop models that could alleviate this gap, and establish whether Text-to-Speech-based (TTS) systems are a reasonable surrogate to the more labor-intensive human data collection. We created three spoken versions of the popular written-domain MultiWoz task -- (a) TTS-Verbatim: written user inputs were converted into speech waveforms using a TTS system, (b) Human-Verbatim: humans spoke the user inputs verbatim, and (c) Human-Paraphrased: humans paraphrased the user inputs. Additionally, we provided different forms of ASR output to encourage wider participation from teams that may not have access to state-of-the-art ASR systems. These included ASR transcripts, word time stamps, and latent representations of the audio (audio encoder outputs). In this paper, we describe the corpus, report results from participating teams, provide preliminary analyses of their results, and summarize the current state-of-the-art in this domain.
As AI systems become more capable, we would like to enlist their help to supervise other AIs. We experiment with methods for training a harmless AI assistant through self-improvement, without any human labels identifying harmful outputs. The only human oversight is provided through a list of rules or principles, and so we refer to the method as 'Constitutional AI'. The process involves both a supervised learning and a reinforcement learning phase. In the supervised phase we sample from an initial model, then generate self-critiques and revisions, and then finetune the original model on revised responses. In the RL phase, we sample from the finetuned model, use a model to evaluate which of the two samples is better, and then train a preference model from this dataset of AI preferences. We then train with RL using the preference model as the reward signal, i.e. we use 'RL from AI Feedback' (RLAIF). As a result we are able to train a harmless but non-evasive AI assistant that engages with harmful queries by explaining its objections to them. Both the SL and RL methods can leverage chain-of-thought style reasoning to improve the human-judged performance and transparency of AI decision making. These methods make it possible to control AI behavior more precisely and with far fewer human labels.
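A pseudocode-level sketch of the supervised critique-and-revision phase described above; `model.generate` and both prompts are placeholders, not an actual LM API:

```python
CRITIQUE_PROMPT = "\nIdentify specific ways the assistant's last response is harmful."
REVISION_PROMPT = "\nRewrite the response to remove the harmful content."

def critique_and_revise(model, query, n_rounds=2):
    response = model.generate(query)
    for _ in range(n_rounds):
        # The model critiques its own output, then revises in light of the critique.
        critique = model.generate(query + response + CRITIQUE_PROMPT)
        response = model.generate(query + response + critique + REVISION_PROMPT)
    return response  # revised responses become supervised finetuning targets
```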
Detection Transformer (DETR) directly transforms queries to unique objects by using one-to-one bipartite matching during training and enables end-to-end object detection. Recently, these models have surpassed traditional detectors on COCO with undeniable elegance. However, they differ from traditional detectors in multiple designs, including model architecture and training schedules, and thus the effectiveness of one-to-one matching is not fully understood. In this work, we conduct a strict comparison between the one-to-one Hungarian matching in DETRs and the one-to-many label assignments in traditional detectors with non-maximum suppression (NMS). Surprisingly, we observe that one-to-many assignments with NMS consistently outperform standard one-to-one matching under the same setting, with a significant gain of up to 2.5 mAP. Our detector, which trains Deformable-DETR with traditional IoU-based label assignment, achieved 50.2 COCO mAP within 12 epochs (1x schedule) with a ResNet50 backbone, outperforming all existing traditional or transformer-based detectors in this setting. On multiple datasets, schedules, and architectures, we consistently show that bipartite matching is unnecessary for performant detection transformers. Furthermore, we attribute the success of detection transformers to their expressive transformer architecture. Code is available at https://github.com/jozhang97/DETA.
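A minimal sketch of one-to-many IoU-based assignment: every anchor whose best IoU with a ground-truth box clears a threshold becomes a positive, so a single object can supervise many predictions (duplicates are later removed by NMS). The 0.5 threshold is a conventional choice, not necessarily the paper's, and at least one ground-truth box is assumed:

```python
import torch
from torchvision.ops import box_iou

def one_to_many_assign(anchors, gt_boxes, iou_threshold=0.5):
    """anchors: (A, 4), gt_boxes: (G, 4), boxes as (x1, y1, x2, y2).
    Returns each anchor's matched gt index, or -1 for background."""
    iou = box_iou(anchors, gt_boxes)      # (A, G) pairwise IoU
    best_iou, best_gt = iou.max(dim=1)    # best gt per anchor
    return torch.where(best_iou >= iou_threshold, best_gt,
                       torch.full_like(best_gt, -1))
```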
Noninvasive X-ray imaging of nanoscale three-dimensional objects, e.g. integrated circuits (ICs), generally requires two types of scanning: ptychographic, which is translational and returns estimates of the complex electromagnetic field through the ICs; and tomographic, which collects complex field projections from multiple angles. Here, we present Attentional Ptycho-Tomography (APT), an approach trained to provide accurate reconstructions of ICs despite incomplete measurements, using a dramatically reduced amount of angular scanning. The training process includes regularizing priors based on typical IC patterns and the physics of X-ray propagation. We demonstrate that APT, with a 12-fold reduction in scanning angles, achieves fidelity comparable to the gold standard that uses the original set of angles. With the same reduced set of angles, APT also outperforms baseline reconstruction methods. In our experiments, APT achieves a 108-fold aggregate reduction in data acquisition and computation without compromising quality. We expect that our physics-assisted machine learning framework could also be applied to other branches of nanoscale imaging.